<!DOCTYPE html>
<html lang="en">
<head>
	<meta charset="UTF-8">
	<meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
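	<title>Analyze API | Elasticsearch 7.7 Definitive Guide (Chinese edition)</title>
	<meta name="keywords" content="Elasticsearch Definitive Guide Chinese edition, elasticsearch 7, es7, real-time data analysis, real-time data retrieval" />
    <meta name="description" content="Elasticsearch Definitive Guide Chinese edition, elasticsearch 7, es7, real-time data analysis, real-time data retrieval" />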
    <!-- Give IE8 a fighting chance -->
    <!--[if lt IE 9]>
    <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
    <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
    <![endif]-->
	<link rel="stylesheet" type="text/css" href="../static/styles.css" />
	<script>
	var _link = 'indices-analyze.html';
    </script>
</head>
<body>
<div class="main-container">
    <section id="content">
        <div class="content-wrapper">
            <section id="guide" lang="zh_cn">
                <div class="container">
                    <div class="row">
                        <div class="col-xs-12 col-sm-8 col-md-8 guide-section">
                            <div style="color:gray; word-break: break-all; font-size:12px;">Original English version: <a href="https://www.elastic.co/guide/en/elasticsearch/reference/7.7/indices-analyze.html" rel="nofollow" target="_blank">https://www.elastic.co/guide/en/elasticsearch/reference/7.7/indices-analyze.html</a>. Copyright of the original documentation belongs to www.elastic.co.<br/>Local English version: <a href="../en/indices-analyze.html" rel="nofollow" target="_blank">../en/indices-analyze.html</a></div>
                        <!-- start body -->
                  <div class="page_header">
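<strong>Important</strong>: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html" rel="nofollow">current release documentation</a>.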
</div>
<div id="content">
<div class="section">
<div class="titlepage"><div><div>
<h2 class="title">
<a id="indices-analyze"></a>Analyze API
</h2>
</div></div></div>

<p>Performs <a class="xref" href="analysis.html" title="Text analysis">analysis</a> on a text string
and returns the resulting tokens.</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /_analyze
{
  "analyzer" : "standard",
  "text" : "Quick Brown Foxes!"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1595.console"></div>
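<p>As an illustration, the <code class="literal">standard</code> analyzer splits this text on word boundaries, lowercases each term, and drops the punctuation, so a response along the following lines is expected. This is a sketch of the response shape and is not reproduced from the original reference.</p>
<div class="pre_wrapper lang-console-result">
<pre class="programlisting prettyprint lang-console-result">{
  "tokens" : [
    {
      "token" : "quick",
      "start_offset" : 0,
      "end_offset" : 5,
      "type" : "&lt;ALPHANUM&gt;",
      "position" : 0
    },
    {
      "token" : "brown",
      "start_offset" : 6,
      "end_offset" : 11,
      "type" : "&lt;ALPHANUM&gt;",
      "position" : 1
    },
    {
      "token" : "foxes",
      "start_offset" : 12,
      "end_offset" : 17,
      "type" : "&lt;ALPHANUM&gt;",
      "position" : 2
    }
  ]
}</pre>
</div>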
<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="analyze-api-request"></a>Request
</h3>
</div></div></div>
<p><code class="literal">GET /_analyze</code></p>
<p><code class="literal">POST /_analyze</code></p>
<p><code class="literal">GET /&lt;index&gt;/_analyze</code></p>
<p><code class="literal">POST /&lt;index&gt;/_analyze</code></p>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="analyze-api-path-params"></a>Path parameters
</h3>
</div></div></div>
<div class="variablelist">
<dl class="variablelist">
<dt>
<span class="term">
<code class="literal">&lt;index&gt;</code>
</span>
</dt>
<dd>
<p>(Optional, string)
Index used to derive the analyzer.</p>
<p>If specified,
the <code class="literal">analyzer</code> or <code class="literal">field</code> parameter overrides this value.</p>
<p>If neither an analyzer nor a field is specified,
the analyze API uses the default analyzer for the index.</p>
<p>If no index is specified
or the index does not have a default analyzer,
the analyze API uses the <a class="xref" href="analysis-standard-analyzer.html" title="Standard Analyzer">standard analyzer</a>.</p>
</dd>
</dl>
</div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="analyze-api-query-params"></a>Query parameters
</h3>
</div></div></div>
<div class="variablelist">
<dl class="variablelist">
<dt>
<span class="term">
<code class="literal">analyzer</code>
</span>
</dt>
<dd>
<p>(Optional, string)
The name of the analyzer that should be applied to the provided <code class="literal">text</code>. This could be a
<a class="xref" href="analysis-analyzers.html" title="Built-in analyzer reference">built-in analyzer</a>, or an analyzer that’s been configured in the index.</p>
<p>If this parameter is not specified,
the analyze API uses the analyzer defined in the field’s mapping.</p>
<p>If no field is specified,
the analyze API uses the default analyzer for the index.</p>
<p>If no index is specified,
or the index does not have a default analyzer,
the analyze API uses the <a class="xref" href="analysis-standard-analyzer.html" title="Standard Analyzer">standard analyzer</a>.</p>
</dd>
<dt>
<span class="term">
<code class="literal">attributes</code>
</span>
</dt>
<dd>
(Optional, array of strings)
Array of token attributes used to filter the output of the <code class="literal">explain</code> parameter.
</dd>
<dt>
<span class="term">
<code class="literal">char_filter</code>
</span>
</dt>
<dd>
(Optional, array of strings)
Array of character filters used to preprocess characters before the tokenizer.
See <a class="xref" href="analysis-charfilters.html" title="Character filters reference"><em>Character filters reference</em></a> for a list of character filters.
</dd>
<dt>
<span class="term">
<code class="literal">explain</code>
</span>
</dt>
<dd>
(Optional, boolean)
If <code class="literal">true</code>, the response includes token attributes and additional details.
Defaults to <code class="literal">false</code>.
<span class="Admonishment Admonishment--experimental">
[<span class="Admonishment-title u-mono">experimental</span>]
<span class="Admonishment-detail">
The format of the additional detail information is labelled as experimental in Lucene and it may change in the future.
</span>
</span>
</dd>
<dt>
<span class="term">
<code class="literal">field</code>
</span>
</dt>
<dd>
<p>(Optional, string)
Field used to derive the analyzer.
To use this parameter,
you must specify an index.</p>
<p>If specified,
the <code class="literal">analyzer</code> parameter overrides this value.</p>
<p>If no field is specified,
the analyze API uses the default analyzer for the index.</p>
<p>If no index is specified
or the index does not have a default analyzer,
the analyze API uses the <a class="xref" href="analysis-standard-analyzer.html" title="Standard Analyzer">standard analyzer</a>.</p>
</dd>
<dt>
<span class="term">
<code class="literal">filter</code>
</span>
</dt>
<dd>
(Optional, array of strings)
Array of token filters to apply after the tokenizer.
See <a class="xref" href="analysis-tokenfilters.html" title="Token filter reference"><em>Token filter reference</em></a> for a list of token filters.
</dd>
<dt>
<span class="term">
<code class="literal">normalizer</code>
</span>
</dt>
<dd>
(Optional, string)
Normalizer to use to convert text into a single token.
See <a class="xref" href="analysis-normalizers.html" title="Normalizers"><em>Normalizers</em></a> for a list of normalizers.
</dd>
<dt>
<span class="term">
<code class="literal">text</code>
</span>
</dt>
<dd>
(Required, string or array of strings)
Text to analyze.
If an array of strings is provided, it is analyzed as a multi-value field.
</dd>
<dt>
<span class="term">
<code class="literal">tokenizer</code>
</span>
</dt>
<dd>
(Optional, string)
Tokenizer to use to convert text into tokens.
See <a class="xref" href="analysis-tokenizers.html" title="Tokenizer reference"><em>Tokenizer reference</em></a> for a list of tokenizers.
</dd>
</dl>
</div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="analyze-api-example"></a>Examples
</h3>
</div></div></div>
<div class="section">
<div class="titlepage"><div><div>
<h4 class="title">
<a id="analyze-api-no-index-ex"></a>No index specified
</h4>
</div></div></div>
<p>You can apply any of the built-in analyzers to the text string without
specifying an index.</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /_analyze
{
  "analyzer" : "standard",
  "text" : "this is a test"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1596.console"></div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h4 class="title">
<a id="analyze-api-text-array-ex"></a>Array of text strings
</h4>
</div></div></div>
<p>If the <code class="literal">text</code> parameter is provided as an array of strings, it is analyzed as a multi-value field.</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /_analyze
{
  "analyzer" : "standard",
  "text" : ["this is a test", "the second text"]
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1597.console"></div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h4 class="title">
<a id="analyze-api-custom-analyzer-ex"></a>Custom analyzer
</h4>
</div></div></div>
<p>You can use the analyze API to test a custom transient analyzer built from
tokenizers, token filters, and char filters. Token filters use the <code class="literal">filter</code>
parameter:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /_analyze
{
  "tokenizer" : "keyword",
  "filter" : ["lowercase"],
  "text" : "this is a test"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1598.console"></div>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /_analyze
{
  "tokenizer" : "keyword",
  "filter" : ["lowercase"],
  "char_filter" : ["html_strip"],
  "text" : "this is a &lt;b&gt;test&lt;/b&gt;"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1599.console"></div>
<p>Custom tokenizers, token filters, and character filters can be specified in the request body as follows:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /_analyze
{
  "tokenizer" : "whitespace",
  "filter" : ["lowercase", {"type": "stop", "stopwords": ["a", "is", "this"]}],
  "text" : "this is a test"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1600.console"></div>
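<p>Character filters and tokenizers can be defined inline in the same way. The following is a minimal sketch, not taken from the original reference: the sample text and the particular <code class="literal">mapping</code> and <code class="literal">pattern</code> settings are illustrative assumptions. It replaces hyphens with underscores, splits the text on commas, and lowercases each token, so the expected tokens are <code class="literal">red_apple</code> and <code class="literal">green_pear</code>.</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /_analyze
{
  "char_filter" : [{"type": "mapping", "mappings": ["- =&gt; _"]}],
  "tokenizer" : {"type": "pattern", "pattern": ","},
  "filter" : ["lowercase"],
  "text" : "Red-Apple,Green-Pear"
}</pre>
</div>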
</div>

<div class="section">
<div class="titlepage"><div><div>
<h4 class="title">
<a id="analyze-api-specific-index-ex"></a>Specific index
</h4>
</div></div></div>
<p>You can also run the analyze API against a specific index:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /analyze_sample/_analyze
{
  "text" : "this is a test"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1601.console"></div>
<p>This request analyzes the text "this is a test" using the
default analyzer of the <code class="literal">analyze_sample</code> index. To use a different analyzer,
specify the <code class="literal">analyzer</code> parameter:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /analyze_sample/_analyze
{
  "analyzer" : "whitespace",
  "text" : "this is a test"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1602.console"></div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h4 class="title">
<a id="analyze-api-field-ex"></a>Derive analyzer from a field mapping
</h4>
</div></div></div>
<p>The analyzer can be derived based on a field mapping, for example:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /analyze_sample/_analyze
{
  "field" : "obj1.field1",
  "text" : "this is a test"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1603.console"></div>
<p>This request analyzes the text using the analyzer configured in the
mapping for <code class="literal">obj1.field1</code>. If the field has no analyzer configured, the default index analyzer is used.</p>
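<p>The index-based examples on this page assume that an <code class="literal">analyze_sample</code> index already exists. The original does not show its definition, so the following is only a hypothetical sketch of how such an index could be created: the <code class="literal">whitespace</code> analyzer on <code class="literal">obj1.field1</code> and the <code class="literal">lowercase</code> filter in <code class="literal">my_normalizer</code> are assumptions chosen for illustration.</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT /analyze_sample
{
  "settings" : {
    "analysis" : {
      "normalizer" : {
        "my_normalizer" : {
          "type" : "custom",
          "filter" : ["lowercase"]
        }
      }
    }
  },
  "mappings" : {
    "properties" : {
      "obj1" : {
        "properties" : {
          "field1" : {
            "type" : "text",
            "analyzer" : "whitespace"
          }
        }
      }
    }
  }
}</pre>
</div>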
</div>

<div class="section">
<div class="titlepage"><div><div>
<h4 class="title">
<a id="analyze-api-normalizer-ex"></a>Normalizer
</h4>
</div></div></div>
<p>For a keyword field, you can provide a <code class="literal">normalizer</code> that is associated with the <code class="literal">analyze_sample</code> index.</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /analyze_sample/_analyze
{
  "normalizer" : "my_normalizer",
  "text" : "BaR"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1604.console"></div>
<p>Alternatively, you can build a custom transient normalizer from token filters and character filters.</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /_analyze
{
  "filter" : ["lowercase"],
  "text" : "BaR"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1605.console"></div>
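<p>A normalizer always emits the whole input as a single token. Assuming <code class="literal">my_normalizer</code> applies a <code class="literal">lowercase</code> filter (its definition is not shown in the original), both requests above would be expected to return something like the following sketch:</p>
<div class="pre_wrapper lang-console-result">
<pre class="programlisting prettyprint lang-console-result">{
  "tokens" : [
    {
      "token" : "bar",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "word",
      "position" : 0
    }
  ]
}</pre>
</div>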
</div>

<div class="section">
<div class="titlepage"><div><div>
<h4 class="title">
<a id="explain-analyze-api"></a>Explain analyze
</h4>
</div></div></div>
<p>To get more detailed output, set <code class="literal">explain</code> to <code class="literal">true</code> (defaults to <code class="literal">false</code>). The response then includes all token attributes for each token.
To limit the output to specific token attributes, set the <code class="literal">attributes</code> option.</p>
<div class="note admon">
<div class="icon"></div>
<div class="admon_content">
<p>The format of the additional detail information is labelled as experimental in Lucene and it may change in the future.</p>
</div>
</div>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /_analyze
{
  "tokenizer" : "standard",
  "filter" : ["snowball"],
  "text" : "detailed output",
  "explain" : true,
  "attributes" : ["keyword"] <a id="CO563-1"></a><i class="conum" data-value="1"></i>
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1606.console"></div>
<div class="calloutlist">
<table border="0" summary="Callout list">
<tr>
<td align="left" valign="top" width="5%">
<p><a href="#CO563-1"><i class="conum" data-value="1"></i></a></p>
</td>
<td align="left" valign="top">
<p>Set to "keyword" to output only the "keyword" attribute.</p>
</td>
</tr>
</table>
</div>
<p>The request returns the following result:</p>
<div class="pre_wrapper lang-console-result">
<pre class="programlisting prettyprint lang-console-result">{
  "detail" : {
    "custom_analyzer" : true,
    "charfilters" : [ ],
    "tokenizer" : {
      "name" : "standard",
      "tokens" : [ {
        "token" : "detailed",
        "start_offset" : 0,
        "end_offset" : 8,
        "type" : "&lt;ALPHANUM&gt;",
        "position" : 0
      }, {
        "token" : "output",
        "start_offset" : 9,
        "end_offset" : 15,
        "type" : "&lt;ALPHANUM&gt;",
        "position" : 1
      } ]
    },
    "tokenfilters" : [ {
      "name" : "snowball",
      "tokens" : [ {
        "token" : "detail",
        "start_offset" : 0,
        "end_offset" : 8,
        "type" : "&lt;ALPHANUM&gt;",
        "position" : 0,
        "keyword" : false <a id="CO564-1"></a><i class="conum" data-value="1"></i>
      }, {
        "token" : "output",
        "start_offset" : 9,
        "end_offset" : 15,
        "type" : "&lt;ALPHANUM&gt;",
        "position" : 1,
        "keyword" : false <a id="CO564-2"></a><i class="conum" data-value="1"></i>
      } ]
    } ]
  }
}</pre>
</div>
<div class="calloutlist">
<table border="0" summary="Callout list">
<tr>
<td align="left" valign="top" width="5%">
<p><a href="#CO564-1"><i class="conum" data-value="1"></i></a><a href="#CO564-2"></a></p>
</td>
<td align="left" valign="top">
<p>Only the "keyword" attribute is output, because "attributes" is specified in the request.</p>
</td>
</tr>
</table>
</div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h4 class="title">
<a id="tokens-limit-settings"></a>Setting a token limit
</h4>
</div></div></div>
<p>Generating an excessive number of tokens may cause a node to run out of memory.
The following setting limits the number of tokens that can be produced:</p>
<div class="variablelist">
<dl class="variablelist">
<dt>
<span class="term">
<code class="literal">index.analyze.max_token_count</code>
</span>
</dt>
<dd>
The maximum number of tokens that the <code class="literal">_analyze</code> API can produce.
The default value is <code class="literal">10000</code>. If a request generates more tokens than this limit,
the API returns an error. When no index is specified, the <code class="literal">_analyze</code> endpoint
always uses a limit of <code class="literal">10000</code>. Use this setting to control
the limit for a specific index:
</dd>
</dl>
</div>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT /analyze_sample
{
  "settings" : {
    "index.analyze.max_token_count" : 20000
  }
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1607.console"></div>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /analyze_sample/_analyze
{
  "text" : "this is a test"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/1608.console"></div>
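<p>For an index that already exists, the limit could instead be changed through the update index settings API. This is a sketch that assumes <code class="literal">index.analyze.max_token_count</code> can be updated dynamically, which the original does not state; the value <code class="literal">5000</code> is arbitrary.</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT /analyze_sample/_settings
{
  "index.analyze.max_token_count" : 5000
}</pre>
</div>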
</div>

</div>

</div>
</div>

                  <!-- end body -->
                        </div>
                        <div class="col-xs-12 col-sm-4 col-md-4" id="right_col">
                        
                        </div>
                    </div>
                </div>
            </section>
        </div>
    </section>
</div>
<script src="../static/cn.js"></script>
</body>
</html>